Abstract

The objective of this article is to formalize the definition of NP 
problems. 

We construct a mathematical model of discrete problems as inde- 
pendence systems with weighted elements. We introduce two auxil- 
iary sets that characterize the solution of the problem: the adjoint set, 
which contains the elements from the original set none of which can be 
adjoined to the already chosen solution elements; and the residual set, 
in which every element can be adjoined to previously chosen solution 
elements. 

In a problem without lookahead, every adjoint set can be generated 
by the solution algorithm effectively, in polynomial time. 

The main result of the study is the assertion that the NP class 
is identical with the class of problems without lookahead. Hence it 
follows that if we fail to find an effective (polynomial-time) solution 
algorithm for a given problem, then we need to look for an alternative 
formulation of the problem in the set of problems without lookahead.

1 Introduction 



Solvability is the key problem in the theory of solution of discrete problems 
JTJ |7J] . The input data and the solution result for any discrete problem are 
usually finite, and generally discrete problems do not suffer from the classical 
difficulty of total nonexistence of a solution algorithm. Many discrete problems
have a trivial solution algorithm, which involves exhaustive enumeration
of the elements of the solution. In practice, however, the trivial algorithm 
is inapplicable, because its computational complexity is too high even for a 
relatively small number of solution elements. Thus, if every discrete problem 
is interpreted as a problem of constructing a subset that satisfies given con- 
straints among the elements of some initial n-set, then the trivial algorithm 
in general requires inspecting all $2^n$ subsets, which is obviously intractable.
In such cases, we say that the trivial algorithm runs in exponential time, or 
has exponential complexity. 
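
The trivial algorithm described above can be sketched as follows. This is only an illustration: the function name `trivial_search`, the predicate `satisfies`, and the sample constraint are assumptions introduced here, not part of the original text.

```python
from itertools import combinations


def trivial_search(elements, satisfies):
    """Inspect subsets in order of size; return (first hit or None, subsets inspected)."""
    inspected = 0
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            inspected += 1
            if satisfies(set(subset)):
                return set(subset), inspected
    return None, inspected


# Hypothetical constraint: the chosen numbers must sum to a target.
items = [3, 5, 7, 11]

# An unreachable target forces the search through all 2**4 = 16 subsets.
result, inspected = trivial_search(items, lambda s: sum(s) == 100)
print(result, inspected)

# A reachable target stops early, but the worst case stays exponential in n.
hit, hit_count = trivial_search(items, lambda s: sum(s) == 12)
print(hit, hit_count)
```

Each additional element of the initial set doubles the number of subsets that the worst case must inspect.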

In the context of solvability of discrete problems we usually discuss the 
possibility of developing an algorithm that generates a solution in a time 
essentially shorter than the running time of the trivial algorithm. A discrete 
problem is regarded as effectively solvable if the running time of the solution 
algorithm is polynomial in the size of the problem. Such algorithms are called 
polynomial-time algorithms. Here and in what follows, the size of a discrete 
problem is the number n of elements in the input set. 

The difficulties that arise in the process of development of solution algo- 
rithms for various discrete problems have led to the identification of a class 
of problems for which it is expedient to look for effective or polynomial-time 
algorithms. First, all problems for which no solution algorithm exists (e.g., 
solvability of polynomial equations in integers) or for which the number of 
solutions depends exponentially on the size of the problem (e.g., finding all 
$2^{n-2}$ covering trees of an n-graph) have been excluded 0, [3]. Among the set
of discrete problems for which the number of solution elements (the length of 
the solution) is a polynomial function of problem size we focus on problems 
that are solvable by a nondeterministic Turing machine (NTM) in polynomial 
time. Discrete problems satisfying these constraints form the class NP. 

Effective (polynomial-time) algorithms are available for solving some NP 
problems, and we accordingly identify a subclass $P \subseteq NP$ of problems with
polynomial-time algorithms. For many practically important NP problems, 
however, attempts to find effective solution algorithms have failed. 

The issue of discrete-problem solvability thus involves the relationship 
between the class NP and its subclass P. Some authors maintain that a strict 
inclusion applies, i.e., $P \subseteq NP$ and $P \neq NP$, while others claim that $P = NP$.
This disagreement among mathematicians is primarily due to ambiguously 
defined notions of NTM operation. The objective of this article is to formalize 
the definition of NP problems. 






We consider the solution of an individual enumerative NP problem. By 
analyzing the operation of the NTM in the process of solving the problem, we 
establish that different interpretations of NTM operation lead to an ambigu- 
ous description of the process. In one interpretation, the NTM constructs the 
next intermediate solution using only the previously generated computation 
results; the other interpretation ignores this important feature. 

We thus establish that unacceptably large enumeration during problem 
solving arises only when the NTM chooses the next solution element by in- 
specting an exponential number of all (final or support) solutions. The set 
of discrete problems is thus partitioned into two disjoint sets: problems for 
which the next solution element can be chosen without inspecting all the 
support solutions; and problems for which such choice is impossible. Prob- 
lems of the first class are called problems without lookahead, while problems 
of the second class are called inherently exponential. 

We construct a mathematical model of discrete problems as independence 
systems with weighted elements. We introduce two auxiliary sets that char- 
acterize the solution of the problem: the adjoint set, which contains the 
elements from the original set none of which can be adjoined to the already 
chosen solution elements; and the residual set, in which every element can 
be adjoined to previously chosen solution elements. 

In a no-lookahead problem, every adjoint set can be generated by the 
solution algorithm effectively, in polynomial time. 

The main result of the study is the assertion that the NP class is identical 
with the class of problems without lookahead. Hence it follows that if we fail 
to find an effective (polynomial-time) solution algorithm for a given problem, 
then we need to look for an alternative formulation of the problem in the set 
of no-lookahead problems. 

2 Example of a discrete problem 

Consider the acyclic digraph shown in Fig. 1(a) (in all figures, the arcs
are directed from bottom up). The transitive closure graph of the acyclic 
digraph is obviously a graph of a strict partial ordering. Any algorithm that 
constructs the maximum matching in a bipartite graph (see, e.g., ||) can be 
applied to partition the nodes of this graph into a minimum number of chains 
(a so-called minimum chain partition, MCP). One of these MCPs contains 
the nodes and the arcs of the transitive closure graph that are shown by thick
lines in Fig. 1(a).
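
The reduction of a minimum chain partition to bipartite matching mentioned above can be sketched as follows. This is a minimal illustration that assumes a simple augmenting-path matching routine; the names `transitive_closure`, `minimum_chain_partition`, and the small digraph `example_arcs` are hypothetical, and the digraph is not the one of Fig. 1.

```python
def transitive_closure(n, arcs):
    """Return reach[u] = set of nodes reachable from u by a nonempty path (acyclic input)."""
    succ = {u: set() for u in range(n)}
    for u, v in arcs:
        succ[u].add(v)
    reach = {}

    def visit(u):
        if u not in reach:
            reach[u] = set()
            for v in succ[u]:
                reach[u] |= {v} | visit(v)
        return reach[u]

    for u in range(n):
        visit(u)
    return reach


def minimum_chain_partition(n, arcs):
    """Match each node to at most one successor in the closure; matched arcs form chains."""
    reach = transitive_closure(n, arcs)
    next_of = {}  # node -> matched successor
    prev_of = {}  # node -> matched predecessor

    def augment(u, seen):
        for v in sorted(reach[u]):
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current predecessor can be re-matched elsewhere.
            if v not in prev_of or augment(prev_of[v], seen):
                next_of[u] = v
                prev_of[v] = u
                return True
        return False

    for u in range(n):
        augment(u, set())

    chains = []
    for start in range(n):
        if start not in prev_of:  # a node without predecessor starts a chain
            chain = [start]
            while chain[-1] in next_of:
                chain.append(next_of[chain[-1]])
            chains.append(chain)
    return chains


# Hypothetical digraph on nodes 0..4.
example_arcs = [(0, 2), (1, 2), (2, 3), (2, 4)]
chains = minimum_chain_partition(5, example_arcs)
print(chains)
```

A chain produced this way may join two nodes by an arc that exists only in the transitive closure, which is exactly the situation of "independent" nodes of the original digraph discussed in this section.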



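Fig. 1: (a) the acyclic digraph with one MCP of its transitive closure graph shown by thick lines; (b) an alternating cycle; (c) the MCP generated by this cycle.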

It is easy to see that the first chain in the MCP starting from the node
$x_1$ contains two pairs of independent digraph nodes. In
general, the transitive closure graph can have several different MCPs, and
the transition from one MCP to another is possible if we find an alternating 
cycle or an alternating chain. 

Suppose that in the transitive closure graph for a given MCP it is re- 
quired to find an alternating chain or an alternating cycle that takes us from 
the current MCP to another MCP such that none of the chains contains 
independent nodes of the original graph. 

Is this an NP problem? The solution of this problem - an alternating 
cycle - obviously contains a number of elements that depends linearly on 
the number of nodes of the acyclic digraph. The problem is thus NP if it is 
solvable by NTM in polynomial time. 

According to one interpretation, the NTM operates in two stages 0, [|. § : 
first the machine "guesses" some sequence of solution elements, and then it 
decides in polynomial time that the guessed sequence is a solution of the given
problem. It is thus assumed that the guessing stage and the verification 
stage are both executed by the NTM in polynomial time. A key point in 
this interpretation is the feasibility of deciding in polynomial time that the 
solution is correct. 

The correctness of the presented solution - an alternating cycle - obvi- 
ously can be checked in polynomial time for our problem. Therefore, accord- 
ing to this interpretation, the problem is NP. 

Yet there is also an alternative interpretation of NTM operation [p], [5], [7].
In |J, the operation of an NTM is illustrated by the problem of finding a 
correct k-coloring of some n-graph. There are $k^n$ different ways to paint the
nodes of an n-graph in k colors. It is required to decide if at least one of 
these colorings is a correct coloring. To this end it is obviously sufficient to 
examine all the edges of the colored graph, and if the end points of each edge 
are painted in different colors, then the coloring is correct. 

The number of edges in a graph is of order $O(n^2)$. According to this
NTM interpretation, we need to check simultaneously all $k^n$ colorings, and
the entire checking procedure is a linear function of the length of the input 
data - the number of elements in the adjacency matrix of the graph. 
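
The edge-by-edge check and its deterministic simulation can be sketched as follows. The function names and the two sample graphs are assumptions made for illustration; a deterministic program has to try the $k^n$ colorings one after another, whereas the NTM of this interpretation is imagined to check them all at once.

```python
from itertools import product


def is_correct_coloring(edges, coloring):
    """Examine every edge once: its end points must be painted in different colors."""
    return all(coloring[u] != coloring[v] for u, v in edges)


def find_coloring_by_enumeration(n, edges, k):
    """Deterministic stand-in for the NTM: try all k**n colorings one by one."""
    for coloring in product(range(k), repeat=n):
        if is_correct_coloring(edges, coloring):
            return coloring
    return None


# Hypothetical 4-node cycle: two colors suffice.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Hypothetical triangle: two colors do not suffice.
triangle_edges = [(0, 1), (1, 2), (2, 0)]

cycle_coloring = find_coloring_by_enumeration(4, cycle_edges, 2)
triangle_coloring = find_coloring_by_enumeration(3, triangle_edges, 2)
print(cycle_coloring)     # (0, 1, 0, 1)
print(triangle_coloring)  # None
```

The check of one coloring costs one pass over the edges; the exponential factor comes only from the number of colorings tried.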

Under this interpretation, the main distinction between the operation of 
the NTM and the operation of a deterministic Turing machine (DTM) is 
that the NTM checks concurrently the correctness of all alternatives. Curi- 
ously, however, some authors (see, e.g., [|I|) use both interpretations of NTM 
operation simultaneously. 

Note that if we adopt the second interpretation of NTM operation, then 
the NTM goes from one state to the next only on the basis of previously 
generated computation results. This is also confirmed by simulating the 
operation of an NTM in a DTM with exhaustive enumeration of all compu- 
tations |Q. We know that in each computation step the DTM goes from one 
state to another (and writes appropriate records on the output tape) only on 
the basis of previously generated (intermediate) results. 

Let us consider from this point of view the construction of the alternating
cycle shown in Fig. 1(b). (Figure 1(c) shows the MCP generated by this
cycle.) Figure 2(a) shows a part of this cycle, and Fig. 2(b) shows the
digraph with the elements of the partition of its transitive closure graph into 
chains generated by an intermediate computation result. 

It is easy to see that the next thin arc $(x_3, x_5)$ of the alternating cycle
cannot be chosen unless we know in advance that the "thick" arc $(x_7, x_9)$ will
subsequently be included in the cycle being constructed. Thus, according to
the second interpretation of NTM operation, this problem is not NP.

Fig. 2: (a) a part of the alternating cycle; (b) the digraph with the chains generated by the intermediate computation result.

We have reached a diametrically opposite conclusion to the previous one. 
To eliminate the contradiction, we need to formalize the class of problems 
solvable by NTM in polynomial time. 

The second interpretation of NTM operation is more appealing. Thus, in 
the example of checking the correctness of a graph coloring it is natural to as- 
sume that in each computation step the NTM decides which of the presented 
colorings are correct, and in the next step the decision about new correct 
colorings is reached by analyzing only the new edge in each "correct" option. 
If we adopt the other interpretation, then we have to agree that the NTM 
has an "instantaneous solver" that in each step allows the machine to "look 
ahead" into the required answer and thus decide which of the intermediate 
options is correct and which is not. 
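
The step-by-step reading of the coloring example can be sketched as follows. This is an illustration under stated assumptions: nodes are colored in index order (one node per step rather than one edge per step), and the name `extend_colorings_stepwise` is hypothetical. The point shown is only that each step inspects previously fixed colors and nothing else; the number of surviving options is not bounded by the sketch.

```python
def extend_colorings_stepwise(n, edges, k):
    """Color nodes 0..n-1 in order, keeping only partial colorings that are still correct."""
    # earlier[v] lists the neighbours of v that are colored before v.
    earlier = {v: [] for v in range(n)}
    for u, v in edges:
        earlier[max(u, v)].append(min(u, v))

    partial = [()]  # start from the single empty coloring
    for v in range(n):
        partial = [
            coloring + (color,)
            for coloring in partial
            for color in range(k)
            # Only the new node and already fixed colors are inspected here.
            if all(coloring[u] != color for u in earlier[v])
        ]
    return partial


# Hypothetical path on three nodes and a hypothetical triangle, two colors each.
path_colorings = extend_colorings_stepwise(3, [(0, 1), (1, 2)], 2)
triangle_colorings = extend_colorings_stepwise(3, [(0, 1), (1, 2), (2, 0)], 2)
print(path_colorings)      # [(0, 1, 0), (1, 0, 1)]
print(triangle_colorings)  # []
```

No step needs to know which final colorings exist: an option is kept or discarded using the intermediate result alone.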

We have previously identified the "uninteresting" class of discrete prob- 
lems that are inherently exponential. The problem of finding an alternating 
chain or cycle should be classified as inherently exponential, because in this 
problem we cannot use a partial (intermediate) result to pass to the next 
"correct" intermediate or final result. In general, this problem requires "in- 
specting" all final results (the number of which is an exponential function of 
the number of graph nodes). 

In the next section we formalize the characteristics of such problems. 

3 Set-theoretical model of discrete problems



In set-theoretical terms, many (finite) discrete problems involve selecting
subsets $\pi_i$ ($i = 1, \ldots, m$) from some n-set $R$ that satisfy given constraints ||. A
discrete problem is therefore defined as the 4-tuple $Z = (R, Q, M, f)$, where
$R = \{r_1, \ldots, r_n\}$ ($n > 1$) is called the work set and the feasibility region $Q$
is a nonempty collection of subsets $\pi$ of the set $R$ that satisfy the following
condition:

1°) if $\pi \in Q$ and $\pi_1 \subseteq \pi$, then $\pi_1 \in Q$.

The set $M = \{\mu(r_1), \ldots, \mu(r_n)\}$ is a collection of nonnegative integers,
and $f$ is a function defined on $Q$. The pair $(R, Q)$ is obviously an independence
system.

Each element $\pi$ in $Q$ is called a feasible solution of problem $Z$, and the
number $\mu(r_i) \in M$ ($\mu(r_i) \geq 0$) is the weight of the element $r_i \in R$ ($i = 1, \ldots, n$).
In what follows we assume that for every $\pi \in Q$,

$$f(\pi) = \sum_{r_i \in \pi} \mu(r_i).$$
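
The 4-tuple model can be sketched in code as follows. This is a minimal illustration, assuming that $f$ is the sum of the weights of the chosen elements and that $Q$ is given by a membership test; the class name `DiscreteProblem`, its fields, and the matching example are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Tuple

Edge = Tuple[str, str]


@dataclass
class DiscreteProblem:
    work_set: FrozenSet[Edge]                       # R
    is_feasible: Callable[[FrozenSet[Edge]], bool]  # membership test for Q
    weight: Dict[Edge, int]                         # mu(r), nonnegative integers

    def f(self, solution):
        """Objective value, assumed here to be the sum of the element weights."""
        return sum(self.weight[r] for r in solution)

    def satisfies_condition_1(self):
        """Brute-force check: dropping any one element of a feasible set keeps it feasible."""
        items = sorted(self.work_set)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                pi = frozenset(combo)
                if self.is_feasible(pi) and not all(
                    self.is_feasible(pi - {r}) for r in pi
                ):
                    return False
        return True


def is_matching(edge_set):
    """Feasible means that no two chosen edges share an end point."""
    nodes = [x for edge in edge_set for x in edge]
    return len(nodes) == len(set(nodes))


# Hypothetical instance: the edges of the path a-b-c-d with integer weights.
ab, bc, cd = ("a", "b"), ("b", "c"), ("c", "d")
problem = DiscreteProblem(frozenset({ab, bc, cd}), is_matching, {ab: 2, bc: 5, cd: 1})
print(problem.satisfies_condition_1())   # True
print(problem.f(frozenset({ab, cd})))    # 3
```

Checking single-element removals is enough for condition 1°, because every subset of a feasible set is reached by removing elements one at a time.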

The solution $\pi \in Q$ is called a support solution if there is no $\pi_1 \in Q$ such
that $\pi \subseteq \pi_1$ and $\pi$ is a proper subset of the set $\pi_1$. Denote by $B \subseteq Q$ the set
of all support solutions of problem $Z$.

Problem $Z$ is called nontrivial if $\mathrm{Card}(B) = O(2^{p(n)})$. In other words,
problem $Z$ is nontrivial if the set of its support solutions contains exponentially
many elements.

Suppose that it is required to find at least one support solution $\pi^* \in B$
such that $f(\pi^*)$ takes a specified value.

In a particular case, problem $Z$ is called extremal if the function $f(\pi^*)$
takes an extremal value, i.e., for a maximization problem $f(\pi^*) \geq f(\pi)$ and
for a minimization problem $f(\pi^*) \leq f(\pi)$, where $\pi \in Q$ is any support
solution of problem $Z$.
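
Support solutions and the extremal case can be illustrated by brute force on a small instance. The sketch below assumes that $f$ is the sum of element weights; the helper names and the path example are hypothetical, and the enumeration is exponential by design.

```python
from itertools import combinations


def feasible_solutions(work_set, is_feasible):
    """List Q by testing every subset of the work set."""
    items = sorted(work_set)
    return [
        frozenset(c)
        for size in range(len(items) + 1)
        for c in combinations(items, size)
        if is_feasible(frozenset(c))
    ]


def support_solutions(work_set, is_feasible):
    """Feasible solutions that are not properly included in another feasible solution."""
    q = feasible_solutions(work_set, is_feasible)
    return [pi for pi in q if not any(pi < other for other in q)]


def is_matching(edge_set):
    nodes = [x for edge in edge_set for x in edge]
    return len(nodes) == len(set(nodes))


# Hypothetical instance: the edges of the path a-b-c-d with integer weights.
ab, bc, cd = ("a", "b"), ("b", "c"), ("c", "d")
weight = {ab: 2, bc: 5, cd: 1}


def f(pi):
    return sum(weight[r] for r in pi)


B = support_solutions({ab, bc, cd}, is_matching)
best_max = max(B, key=f)  # extremal support solution for maximization
best_min = min(B, key=f)  # extremal support solution for minimization
print(sorted(map(sorted, B)), f(best_max), f(best_min))
```

Here $B$ has two elements, {bc} with value 5 and {ab, cd} with value 3, so the maximization and minimization versions select different support solutions.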

4 Auxiliary solution sets 

Denote by $W(\pi)$ the union of all feasible solutions $\pi_1 \in Q$ of problem $Z$ each
of which includes the solution $\pi \in Q$,

$$W(\pi) = \bigcup_{\pi_1 \in Q,\ \pi \subseteq \pi_1} \pi_1.$$

Clearly, $W(\pi) \subseteq R$.

The set $S(\pi) = R \setminus W(\pi)$ is called adjoint to the solution $\pi \in Q$. Thus,
an adjoint set consists of those and only those elements of the work set $R$
whose union with the given feasible solution $\pi \in Q$ forms a subset of the set
$R$ that is not a solution of problem $Z$.

Theorem 1 If $\pi_1, \pi_2 \in Q$ and $\pi_1 \subseteq \pi_2$, then $S(\pi_1) \subseteq S(\pi_2)$.

Clearly, if $\pi_1 \subseteq \pi_2$, then $W(\pi_2) \subseteq W(\pi_1)$. Thus
$S(\pi_1) = R \setminus W(\pi_1) \subseteq R \setminus W(\pi_2) = S(\pi_2)$. □

The set $R(\pi) = R \setminus (\pi \cup S(\pi))$ is called the residual set for the solution
$\pi \in Q$.

Theorem 2 If $\pi \in Q$ and $r \in R(\pi) \neq \emptyset$, then $\pi \cup \{r\} \in Q$.

Indeed,

$$R(\pi) = R \setminus (\pi \cup S(\pi)) = (R \setminus \pi) \cap (R \setminus S(\pi)) = (R \setminus \pi) \cap W(\pi) = (R \cap W(\pi)) \setminus \pi = W(\pi) \setminus \pi.$$

Therefore, for $R(\pi) \neq \emptyset$, the set $\pi \cup \{r\}$ is included in at least one feasible
solution from $Q$ and is thus also a solution by property 1°. □

Theorem 3 If $\pi \in Q$ is a support solution, then $R(\pi) = \emptyset$ and $\pi \cup S(\pi) = R$.

Let $\pi \in Q$ be a support solution of problem $Z$. Assume that $R(\pi) \neq \emptyset$.
This leads to the conclusion that the region $Q$ contains a solution $\pi_1 = \pi \cup \{r\}$
($r \in R(\pi)$), which properly includes the solution $\pi \in Q$. This contradicts
the definition of a support solution.

The relationship $\pi \cup S(\pi) = R$ for every support solution $\pi \in Q$ follows
from the definition of the residual set when $R(\pi) = \emptyset$. □
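
The sets $W(\pi)$, $S(\pi)$, and $R(\pi)$ and the three theorems can be checked by brute force on a small instance. The sketch below is an illustration only: the helper names and the example (subsets of {1, 2, 3, 4} without two consecutive integers) are assumptions, and the sets are computed directly from the definitions rather than efficiently.

```python
from itertools import combinations


def all_feasible(work_set, is_feasible):
    items = sorted(work_set)
    return [
        frozenset(c)
        for size in range(len(items) + 1)
        for c in combinations(items, size)
        if is_feasible(frozenset(c))
    ]


def W(pi, q):
    """Union of all feasible solutions that include pi."""
    return frozenset().union(*[other for other in q if pi <= other])


def S(pi, work_set, q):
    """Adjoint set: the work set minus W(pi)."""
    return work_set - W(pi, q)


def residual(pi, work_set, q):
    """Residual set: the work set minus (pi united with S(pi))."""
    return work_set - (pi | S(pi, work_set, q))


# Hypothetical independence system: no two chosen integers are consecutive.
R = frozenset({1, 2, 3, 4})
q = all_feasible(R, lambda s: all(x + 1 not in s for x in s))
q_set = set(q)
support = [pi for pi in q if not any(pi < other for other in q)]

theorem1 = all(S(a, R, q) <= S(b, R, q) for a in q for b in q if a <= b)
theorem2 = all(pi | {r} in q_set for pi in q for r in residual(pi, R, q))
theorem3 = all(
    not residual(pi, R, q) and pi | S(pi, R, q) == R for pi in support
)
print(theorem1, theorem2, theorem3)
```

For instance, for the solution {1} the sketch gives $W = \{1, 3, 4\}$, the adjoint set {2}, and the residual set {3, 4}.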

5 Class of problems without lookahead



Let $T$ be the set of all possible problems $Z$. It follows from the above discussion
that the issue of solvability of problem $Z$ involves developing an
algorithm (a deterministic Turing machine) that finds at least one support
solution $\pi \in B$ of the problem for which the function $f(\pi)$ takes a specified
value and does it in a time polynomial in the number of elements of the work
set $R$.

In the set $T$ we identify the subclass $T_1$ of problems in which the set of
support solutions contains exponentially many elements (more precisely, the
number of elements is a function of the form $2^{p(n)}$). Any problem $Z \in T_1$
is called nontrivial. In what follows we only consider the set of nontrivial
problems $T_1 \subseteq T$ for which $\mathrm{Card}(B) = O(2^{p(n)})$.

We say that the adjoint set $S(\pi)$ for a given solution $\pi$ is determined
effectively if for all elements $r_i \in R \setminus \pi$ we can decide in polynomial time the
truth of the predicate "$\pi \cup \{r_i\} \in Q$" or the predicate "$\pi \cup \{r_i\} \notin Q$".

Problem $Z$ is called a problem without lookahead if for every feasible solution
$\pi \in Q$ the adjoint set is determined effectively.
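
Effective determination of the adjoint set can be sketched as follows, assuming a polynomial-time feasibility test is available. The function names and the example are hypothetical. The element-by-element test agrees with the definition of the adjoint set because of condition 1°. The greedy routine only shows that some support solution is reached with at most $n$ tests per step; it does not address finding a support solution with a specified value of $f$.

```python
def adjoint_set(pi, work_set, is_feasible):
    """Adjoint set via one feasibility test per element outside pi."""
    return {r for r in work_set - pi if not is_feasible(pi | {r})}


def greedy_support_solution(work_set, is_feasible):
    """Grow a feasible solution element by element until the residual set is empty."""
    pi = frozenset()
    while True:
        residual = work_set - pi - adjoint_set(pi, work_set, is_feasible)
        if not residual:
            return pi  # nothing can be adjoined: pi is a support solution
        pi = pi | {min(residual)}


# Hypothetical no-lookahead instance: no two chosen integers are consecutive.
R = frozenset({1, 2, 3, 4})


def no_consecutive(s):
    return all(x + 1 not in s for x in s)


first_adjoint = adjoint_set(frozenset({1}), R, no_consecutive)
found = greedy_support_solution(R, no_consecutive)
print(first_adjoint, sorted(found))
```

Each step uses only the current intermediate solution, which is the defining feature of a problem without lookahead.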

Theorem 4 If $Z \in T_1$ is a no-lookahead problem, then it is solvable by a
nondeterministic Turing machine in polynomial time.

Indeed, by definition the size of problem $Z$ is the number of elements $n$
in the work set $R$, and the length of the solution, defined as the number of
elements in some support solution $\pi \subseteq R$, is a linear function of $n$. Since
$Z$ is a no-lookahead problem, the NTM can compute all the feasible
solutions simultaneously. Hence it follows that every no-lookahead problem
$Z \in T_1$ is solvable by an NTM in polynomial time. □

Theorem 5 The class NP is identical with the class of no-lookahead problems,
i.e., $T_1 = NP$.

By Theorem 4, $T_1 \subseteq NP$. By description, the class NP does not include
inherently exponential problems. Thus, $T_1 = NP$. □

6 Conclusion 

At first glance it would seem that the accepted interpretation of NTM operation
essentially restricts the set of NP-complete problems that are considered when
the interpretation allows solution "guessing." This is not so, however. We
know that the formulation of a problem has an essential impact on the possi- 
bility of solving the problem. The accepted interpretation of NTM operation 
makes it possible to reject formulations that a priori require exhaustive enu- 
meration of an exponential set of support solutions. 

For example, consider the problem of finding a Hamiltonian cycle in a graph.
This is an inherently exponential problem if it is formulated so that the 
construction of each feasible solution requires "guessing" a correct choice, 
i.e., "advance knowledge" of the collection of edges that forms a feasible 
solution, or belongs to at least one support solution - a Hamiltonian cycle. 

The same problem can be formulated in a different setting: find a parti- 
tion of the graph into a minimum number of cycles and edges. If the graph 
is Hamiltonian, then the solution of this problem produces the sought cycle. 